Trends in outcomes following COVID-19 symptom onset in Milan: a cohort study

Background For people with symptomatic COVID-19, the relative risks of hospital admission, death without hospital admission and recovery without admission, and the times to those events, are not well understood. We describe how these quantities varied with individual characteristics, and through the first wave of the pandemic, in Milan, Italy. Methods A cohort study of 27 598 people with known COVID-19 symptom onset date in Milan, Italy, testing positive between February and June 2020 and followed up until 17 July 2020. The probabilities of different events, and the times to events, were estimated using a mixture multistate model. Results The risk of death without hospital admission was higher in March and April (for non-care home residents, 6%–8% compared with 2%–3% in other months) and substantially higher for care home residents (22%–29% in March). For all groups, the probabilities of hospitalisation decreased from February to June. The probabilities of hospitalisation also increased with age, and were higher for men, substantially lower for healthcare workers and care home residents, and higher for people with comorbidities. Times to hospitalisation and confirmed recovery also decreased throughout the first wave. Combining these results with our previously developed model for events following hospitalisation, the overall symptomatic case fatality risk was 15.8% (15.4%–16.2%). Conclusions The highest risks of death before hospital admission coincided with periods of severe burden on the healthcare system in Lombardy. Outcomes for care home residents were particularly poor. Outcomes improved as the first wave waned, community healthcare resources were reinforced and testing became more widely available.


National Health System Italian Ministry of Health
The parliament sets the Essential Levels of Care (LEA) and the National Health Fund. Agreements are made between the IMH and the regional governments for the monitoring of LEA-related indicators, objectives for implementation and quality assurance, and budget criteria for fund allocation.

Regional Healthcare System
General Directorate of Welfare of Regione Lombardia: 1) Defines budgets, rules and objectives for Local Health Authorities, which are the payers, and for hospital and clinic trusts (called Territorial Health Social Trusts), which are the providers; 2) Coordinates and supports Local Health Authorities; 3) Manages the regional "monitoring system" (database); 4) Reports performance indicators to IMH.

BMJ Publishing Group Limited (BMJ) disclaims all liability and responsibility arising from any reliance placed on this supplemental material which has been supplied by the author(s).

Appendix Table 3. Probabilities of death before hospital admission, after COVID-19 onset, for people over 65 years of age who were not healthcare workers, by patient characteristic and month of onset. Missing outcomes considered as censoring (as in main Figure 1).

Probabilities of hospital admission
Appendix Figure 2. Probability of hospital admission following COVID-19 onset, for people without comorbidities, comparing by month of onset, age group, gender and whether a person is a care home resident, or a healthcare worker, or neither, for models with and without censored data included.
Full results: times to events

Summary of times to observed events
Appendix Figure 4. Histograms showing the frequencies (by subgroup) of observed times from COVID-19 onset to alternative events (hospital admission, death without admission or recovery without admission), or time to censoring for those whose next event was unknown.
Appendix Figure 4 illustrates the distribution of the observed times from onset to hospital admission, death without admission and confirmed recovery without admission, and times to the assumed date of censoring (minimum of time to data extraction and 60 days) for people who had none of these events recorded.
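The assumed censoring rule described above (the minimum of the time to data extraction and 60 days) can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name is hypothetical, and the extraction date is assumed here to coincide with the end of follow-up (17 July 2020).

```python
from datetime import date

MAX_FOLLOW_UP_DAYS = 60  # cap on follow-up stated in the text


def censoring_time(onset: date, extraction: date, cap: int = MAX_FOLLOW_UP_DAYS) -> int:
    """Days from onset to the assumed censoring date: min(time to extraction, cap)."""
    days_to_extraction = (extraction - onset).days
    if days_to_extraction < 0:
        raise ValueError("onset is after the data extraction date")
    return min(days_to_extraction, cap)


# Assumption: data extraction on the last day of follow-up (17 July 2020).
extraction = date(2020, 7, 17)

early = censoring_time(date(2020, 3, 1), extraction)   # 138 days to extraction, capped at 60
late = censoring_time(date(2020, 6, 20), extraction)   # 27 days to extraction, below the cap
print(early, late)
```

Under this rule, people with onset early in the wave are all censored at 60 days, whereas those with onset shortly before extraction are censored earlier.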
Combining all individuals and neglecting censoring, the median time from onset to admission was 6 days (interquartile range 2 to 10 days), the median time to death without admission was 12 days (interquartile range 6 to 22 days) and the median time to confirmed recovery was 41 days (interquartile range 26 to 60 days). Recall that this is defined as the time to the …

Summaries of times to events from the fitted models
Histograms of the observed times to events are shown in Appendix Figure 4. From the fitted model, times to admission (Appendix Figure 5) became shorter from February to June (from median 7 days to 1 day for males aged 66 and over who were not care home residents). Slightly longer times to admission were observed for care home residents (from median 10 days in February to 2 days in June). No substantial differences in times to admission were observed between people of different ages and genders, for healthcare workers or for people with comorbidities (see Appendix Figures 5-7 and Tables 7-12). Variability in times to admission became smaller from February (main manuscript Figure 3, 95% quantile interval 1-50 days for women and men over 65) to June (0-10 days).

For times to death without hospital admission, no significant covariate effects were observed, other than slightly longer times to death for February onsets (median 22 days (13 to 35) for men and 24 days (15 to 39) for women over 65 in the baseline risk group, compared to 10 days or less in later months). Variability in these times was highest in February (2-123 days for women over 65), but constant in subsequent months (1-54 days).
Shorter times to confirmed recovery (Appendix Figure 7) were estimated as time passed from February to June (median 74 days to 19 days for people aged 66 and over who were not healthcare workers or care home residents). Shorter times to confirmed recovery were also estimated for people under 65 (median 57 days in February). Variability between patients in time to recovery decreased from February (95% quantile interval 22-175 days for women over 65) to May (6-46 days).
The finding that younger groups appear to recover slightly faster than older groups is in accordance with the literature (Voinsky, Baristaite, and Gurwitz 2020; Castillo et al. 2020), with the difference possibly explained by the progressive decay of the immune system with age. The estimated time to PCR-confirmed recovery for those not hospitalised progressively decreases with calendar time. This may be explained by increasing testing capacity, either due to recovery being confirmed more rapidly, or because of changes in case-mix due to more cases of lower severity being identified and treated outside hospital following testing. Alternatively, it may reflect improved capacity for treatment outside hospital as the first wave waned.

Times to events compared between subgroups
Appendix Figure 5. Times from COVID-19 onset to hospital admission (median and range containing 95% of individuals), by month of onset, age and gender, and comparing a baseline group with none of the following risk factors: healthcare workers, care home residents and people with comorbidities, to a group with one of these risk factors.
Appendix Figure 7. Times from COVID-19 onset to confirmed recovery (second negative test and no symptoms) (median and range containing 95% of individuals), by month of onset, age and gender, and comparing a baseline group with none of the following risk factors: healthcare workers, care home residents and people with comorbidities, to a group with one of these risk factors.
Appendix Figure 7 shows estimates of times to recovery by subgroups additional to those presented in the main text. Shorter times to confirmed recovery were estimated for people under 65 (median 57 days in February) and for healthcare workers (median 54 days in February to 16 days in June, considering those under 65), and slightly longer times to confirmed recovery for nursing home residents (median 80 days in February to 21 days in June, considering those over 65).
Times to events following COVID-19 onset (missing outcomes considered as censoring)
The following three tables present estimates of the times to the three potential alternative events following COVID-19 onset (hospital admission, death or recovery without admission) corresponding to the figures in the main manuscript, from a model where missing outcomes are considered as censoring.

Details of selected regression models for predictors of outcomes after onset
The multi-state mixture model is composed of two parts: one part determining the event that happens next, and another part determining the time to that event for each potential event.
In the first part, the odds of admission and death (without admission) are defined as different log-linear functions of covariates, where the odds of admission or death are the probability of admission or death, respectively, divided by the probability of recovery. In the second part, the time to each event was assumed to follow a generalized gamma distribution, a flexible three-parameter distribution that is defined in terms of covariates through an accelerated failure time model (Jackson 2018). In both parts, a "best-fitting" dependence on covariates was determined from a range of choices by minimising Akaike's information criterion. This range included models containing interactions of all other covariates with age and month of onset. The goodness-of-fit of the selected models was verified against the observed data by comparing against nonparametric estimates (Appendix Figures 8-10).
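Because both odds are defined relative to recovery, the three next-event probabilities follow directly from the two log odds. The sketch below illustrates that conversion only; the function name and the numerical log odds are hypothetical and are not estimates from the fitted model.

```python
import math


def next_event_probabilities(log_odds_admission: float, log_odds_death: float) -> dict:
    """Convert log odds (each relative to recovery) into probabilities of the three next events."""
    odds_admission = math.exp(log_odds_admission)  # P(admission) / P(recovery)
    odds_death = math.exp(log_odds_death)          # P(death) / P(recovery)
    denominator = 1.0 + odds_admission + odds_death
    return {
        "admission": odds_admission / denominator,
        "death": odds_death / denominator,
        "recovery": 1.0 / denominator,
    }


# Hypothetical log odds, chosen only to illustrate the arithmetic.
probs = next_event_probabilities(math.log(0.5), math.log(0.1))
print(probs)
```

The three probabilities always sum to one, and the ratio of each event probability to the recovery probability returns the corresponding odds.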
The tables below have one row showing the log odds ratio (and standard error) for each main effect or interaction term in the selected regression model, thus showing which terms define the model. These are presented here for technical completeness and to permit comparison of the models with and without censored data. (Note that a clearer presentation of the results of the regression model is given by comparing the absolute probabilities of outcomes between different groups, as in, for example, Appendix Tables 3 and 4.)
The log odds ratios are interpreted as follows. A "baseline" group is defined as age 45, female, onset in March, no comorbidities, not a care home resident or healthcare worker. The log odds ratio comparing a group of interest with the baseline group can be computed by summing each term relating to the group, e.g. the log odds ratio for a healthcare worker aged 66+, relative to the baseline group, should be computed by adding the "Healthcare worker" main effect to the "Healthcare worker, age 66+" interaction term.
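The summation described above can be sketched as follows. The coefficient values are hypothetical placeholders, not values from the appendix tables, and the helper name is an assumption made for illustration; the terms summed are the two named in the example in the text.

```python
import math

# Hypothetical coefficients (log odds ratios); the real values are in the appendix tables.
HYPOTHETICAL_TERMS = {
    "Healthcare worker": -1.20,
    "Healthcare worker, age 66+": 0.45,
}


def log_odds_ratio(terms, coefficients):
    """Sum the coefficients of every term that applies to the group of interest."""
    return sum(coefficients[term] for term in terms)


# Terms named in the text for a healthcare worker aged 66+.
lor = log_odds_ratio(["Healthcare worker", "Healthcare worker, age 66+"], HYPOTHETICAL_TERMS)
odds_ratio = math.exp(lor)  # exponentiate to move from the log scale to an odds ratio
print(round(lor, 2), round(odds_ratio, 3))
```

Exponentiating the summed log odds ratio gives the odds ratio itself; converting to absolute probabilities additionally requires the baseline group's odds.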

Excluding censoring
Including censoring

Goodness of fit of selected parametric models
The overall fit of the parametric assumptions of the multi-state model is checked by comparing predictions of the cumulative incidence probability of events from the parametric model with estimates from the nonparametric Aalen-Johansen method. These parametric assumptions include the form of the distribution for the time to the next event after onset, the selection of covariates that affect the parameters of this distribution, and the selection of covariates that affect the probabilities governing which event happens next. These checks are illustrated in Appendix Figures 8-10. The plots also compare the AIC-selected models fitted to the data (a) with the missing outcomes ignored, and (b) with the missing outcomes considered as censoring.
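For readers unfamiliar with the nonparametric comparator, the following is a minimal sketch of the Aalen-Johansen cumulative incidence estimator for competing next events with right censoring. It is written from the standard definition, is not the authors' implementation, and uses toy data; events tied with censorings at the same time are assumed to occur first.

```python
from collections import Counter


def aalen_johansen(times, events, causes):
    """Cumulative incidence per cause; `events` holds a cause label, or None if censored."""
    data = sorted(zip(times, events), key=lambda record: record[0])
    n_at_risk = len(data)
    event_free = 1.0  # probability of no event of any kind before the current time
    cif = {cause: 0.0 for cause in causes}
    curves = {cause: [] for cause in causes}
    i = 0
    while i < len(data):
        t = data[i][0]
        counts = Counter()
        j = i
        while j < len(data) and data[j][0] == t:  # gather everyone with time t
            if data[j][1] is not None:
                counts[data[j][1]] += 1
            j += 1
        if counts:
            for cause in causes:
                # Increment: P(event-free just before t) * cause-specific hazard at t
                cif[cause] += event_free * counts[cause] / n_at_risk
                curves[cause].append((t, cif[cause]))
            event_free *= 1.0 - sum(counts.values()) / n_at_risk
        n_at_risk -= j - i  # remove events and censorings from the risk set
        i = j
    return curves


# Toy data (days since onset); None marks a censored observation.
causes = ["admission", "death", "recovery"]
times = [2, 3, 3, 5, 8, 10]
events = ["admission", "death", None, "admission", "recovery", None]
curves = aalen_johansen(times, events, causes)
final = {cause: curves[cause][-1][1] for cause in causes}
print(final)
```

In a goodness-of-fit check of the kind described, curves like these would be computed within each covariate subgroup and overlaid on the cumulative incidence predicted by the parametric model.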
The parametric estimates deviate from the nonparametric estimates only in cases where the amount of data informing the nonparametric estimate is small, so that the nonparametric estimate is unreliable. This includes the probability of admission versus recovery beyond 20 days after onset for males aged 45 and below with onset in February (bottom row of Appendix Figure 8), where there were 46 admissions for people in this group, and only three of these beyond 20 days after onset. In Appendix Figure 9, while the fit appears to be worst for healthcare workers aged over 65 with onset in March, and care home residents under 65 with onset in June, these categories included only 33 and 3 observed cases respectively.
In Appendix Figure 10, note that the nonparametric estimates could not be calculated for certain combinations of predictors where there were no or very few observations (e.g. onset in June for people with comorbidities). In these cases, the parametric models allow prediction through the assumption of additivity of terms of the regression model.

Appendix Figure 11 shows the fit of modelled densities to data. This shows parametrically modelled densities for a "baseline" category defined by female, no comorbidities, not a care home resident or healthcare worker, and compares three age groups, for the three different competing events following onset. The model accounting for censoring is compared with the model that neglects the censored data. For the times from onset to hospital admission and death, these two models agree, since if a patient had not died or been admitted to hospital by the end of their follow-up period, they are judged likely to recover. Hence there is negligible censoring of admission or death events, and some censoring of recovery events. Therefore, for the event of recovery, the model that accounts for censoring predicts slightly longer times to events, and we would judge that this model more accurately reflects the true distribution for this event.
Supplementary analysis with a finer age categorisation
An additional model was developed to investigate the relation of age and care home residency to mortality before hospital admission in more detail. For this model, a finer age grouping was defined by splitting the single over-65s group into three categories: ages 65-74, ages 75-84 and ages 85 and over. A logistic regression model was fitted to the uncensored data. Selected covariates included age group, month of admission, gender, care home residency, comorbidities, and an interaction of gender and care home residency with age group and month of admission. Estimated probabilities of death before admission, by age group, gender and care home residency, for a baseline group defined by people …

Table 18. Estimated probabilities of death before hospital admission by age group (using a finer classification) and care home residency, for people without comorbidities and onset in March 2020.
Appendix Figure 8. Predicted cumulative incidence of next events following COVID-19 onset, for the selected parametric models with and without censoring, compared to Aalen-Johansen nonparametric estimates to check goodness of fit. By month of onset, age and gender.